A Clustering Analysis Method With High Reliability Based on Wilcoxon-Mann-Whitney Testing

Abstract

Distance measurement is a core step in clustering analysis, and its results can influence clustering accuracy. Existing methods are mostly based on cluster feature information. However, these features may be insufficient and can result in the loss of data information for clusters containing a number of objects. To improve accuracy, we make full use of the distribution characteristics of the objects in clusters, i.e., descriptive statistics and the nonparametric Wilcoxon-Mann-Whitney rank sum test, to measure distances during clustering. Furthermore, we propose a two-stage clustering algorithm to improve analysis performance. The proposed method avoids preliminary assumptions and can discover clusters of arbitrary shapes. Experiments on multiple datasets, compared with other algorithms, illustrate the accuracy and efficiency of the proposed algorithm.
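To make the idea of a rank-based distance between clusters concrete, the following is a minimal sketch and not the paper's definition. It assumes each cluster is summarized by one-dimensional feature values; the function names `mann_whitney_u` and `rank_dissimilarity`, and the normalization to the range [0, 1], are hypothetical choices made for illustration only.

```python
from typing import Sequence


def mann_whitney_u(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mann-Whitney U for xs versus ys: pairs with x > y count 1, ties count 0.5."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def rank_dissimilarity(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Hypothetical dissimilarity in [0, 1] (an assumption, not the paper's formula).

    0 means neither sample tends to exceed the other;
    1 means the two samples are completely separated.
    """
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    return abs(2.0 * mann_whitney_u(xs, ys) / (n * m) - 1.0)


if __name__ == "__main__":
    print(rank_dissimilarity([1, 2, 3], [4, 5, 6]))  # fully separated samples
    print(rank_dissimilarity([1, 2, 3], [1, 2, 3]))  # identical samples
```

Because U for xs versus ys and U for ys versus xs always sum to n * m, this quantity is symmetric in its two arguments, which is a property one would normally want from a distance-like measure.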


Similar resources

When t-tests or Wilcoxon-Mann-Whitney tests won't do.

t-Tests are widely used by researchers to compare the average values of a numeric outcome between two groups. If there are doubts about the suitability of the data for the requirements of a t-test, most notably the distribution being non-normal, the Wilcoxon-Mann-Whitney test may be used instead. However, although often applied, both tests may be invalid when discrete and/or extremely skew data...


Combinatorics, Computer Algebra and Wilcoxon-Mann-Whitney Test

We show the combinatorics behind the Wilcoxon-Mann-Whitney two-sample test. This yields new combinatorial proofs of recurrences for its null distribution given recently by Brus and Chang, as well as new recurrences. It is shown how to convert these recurrences into generating functions. These generating functions are used to obtain closed expressions for the null distribution when one of the sa...
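As an illustration of what a recurrence for the null distribution looks like, the sketch below uses the classical textbook recurrence for the Mann-Whitney U statistic without ties. It is not taken from the listed paper, whose own recurrences are not shown here. The function names `count_arrangements` and `null_distribution` are hypothetical.

```python
from functools import lru_cache
from math import comb
from typing import List


@lru_cache(maxsize=None)
def count_arrangements(u: int, m: int, n: int) -> int:
    """Number of orderings of m x-values and n y-values (no ties) with U == u.

    U counts pairs (x, y) with x > y. Condition on the largest observation:
    if it is an x, it beats all n y-values; if it is a y, it adds nothing.
    """
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    return count_arrangements(u - n, m - 1, n) + count_arrangements(u, m, n - 1)


def null_distribution(m: int, n: int) -> List[float]:
    """P(U = u) for u = 0..m*n when all orderings are equally likely."""
    total = comb(m + n, m)
    return [count_arrangements(u, m, n) / total for u in range(m * n + 1)]


if __name__ == "__main__":
    print([count_arrangements(u, 2, 2) for u in range(5)])
    print(null_distribution(2, 2))
```

The recursion depth is at most m + n, so this direct form is only suitable for small samples; the memoization keeps the number of distinct subproblems manageable.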


An improved ranked set two-sample Mann-Whitney-Wilcoxon test

The authors present an improved ranked set two-sample Mann-Whitney-Wilcoxon test for a location shift between samples from two distributions F and G. They define a function that measures the amount of information provided by each observation from the two samples, given the actual joint ranking of all the units in a set. This information function is used as a guide for improving the Pitman effic...


Optimizing Classifier Performance Via the Wilcoxon-Mann-Whitney Statistic

Cross entropy and mean squared error are typical cost functions used to optimize classifier performance. The goal of the optimization is usually to achieve the best correct classification rate. However, for many two-class real-world problems, the ROC curve is a more meaningful performance measure. We demonstrate that minimizing cross entropy or mean squared error does not necessarily maximize the...
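The link between the ROC curve and the Wilcoxon-Mann-Whitney statistic rests on a standard identity: the area under the ROC curve equals U / (m * n), where U is computed from the rank sum of the positive-class scores. The sketch below illustrates that identity only; it is not the optimization method of the listed paper, and the names `midranks` and `auc_from_rank_sum` are hypothetical.

```python
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def auc_from_rank_sum(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank sum of the positive scores."""
    m, n = len(pos_scores), len(neg_scores)
    if m == 0 or n == 0:
        raise ValueError("both classes must have at least one score")
    ranks = midranks(list(pos_scores) + list(neg_scores))
    rank_sum_pos = sum(ranks[:m])
    u = rank_sum_pos - m * (m + 1) / 2.0
    return u / (m * n)


if __name__ == "__main__":
    print(auc_from_rank_sum([0.9, 0.4], [0.5, 0.1]))
```

Sorting once and summing ranks costs O((m + n) log(m + n)), compared with O(m * n) for counting every positive-negative pair directly.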


The Mann Whitney Wilcoxon Distribution Using Linked Lists

We give an improved algorithm for calculating the exact null distribution of the two sample Mann Whitney Wilcoxon rank sum statistic. The algorithm modifies the update method of Smid using a minimal linked list which directs calculation of only those intermediate probabilities required for the final value. Using an efficient shortened representation of the list of required intermediate values, ...



Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3053244